ABSTRACT

This paper examines, both theoretically and
empirically, three measures of segregation, with the empirical focus
on school segregation. The first measure is based on the absolute 
deviation of the racial composition of a school from that of the 
school district, the second is based on the square of that deviation, 
and the third is derived from information theory. The purpose of this 
paper is to examine and compare the properties of these three 
measures in terms of how useful they are both as descriptive devices 
and as indicators of appropriate policy actions. Separate discussions 
of the theoretical nature of each index are accompanied by summaries 
of their calculated values based on a sample of school districts. 
Several arguments are given for preferring the information theory 
measure: it incorporates the notion of diminishing marginal payoff to 
desegregation; it depends on the entire distribution of students by 
race across schools; it may be interpreted as a measure of 
association between race and school assignment; it can be 
meaningfully aggregated; and, once aggregated, it can be decomposed
into "between" and "within" components. Its main drawbacks are that
it is somewhat more complicated to calculate and that its
interpretation is not as easily grasped intuitively. The use of any
of the three indexes presented here as a policy aid would be
substantially better than subjective judgment. (Author/JB)

ABSTRACT

Several alternative suggestions for methods of measuring segregation
have appeared in the literature. This paper is an examination, both
theoretical and empirical, of three measures of segregation, with the
empirical focus on school segregation. The first measure is based on the
absolute deviation of the racial composition of a school from that of the
school district, the second is based on the square of that deviation, and
the third is derived from information theory. The purpose of this paper is
to examine and compare the properties of these three measures in terms
of how useful they are both as descriptive devices and as indicators of
appropriate policy actions. Separate discussions of the theoretical
nature of each index are accompanied by summaries of their calculated
values based on a sample of school districts.

Several arguments are given for preferring the information theory 
measure: it incorporates the notion of diminishing marginal payoff to 
desegregation; it depends on the entire distribution of students by race 
across schools; it may be interpreted as a measure of association between 
race and school assignment; it can be meaningfully aggregated; and, once 
aggregated, it can be decomposed into "between" and "within" components. 
Its main drawbacks are that it is somewhat more complicated to calculate
and that its interpretation is not as easily grasped intuitively.

The use of any of the three indexes presented here as a policy aid
would be substantially better than subjective judgment. Moreover, if the
costs of implementation and of gaining acceptance are not too great, then
the information theory index appears to be the most appropriate measure of
school segregation.



AN INVESTIGATION OF ALTERNATIVE MEASURES 
OF SCHOOL SEGREGATION 

INTRODUCTION 

Several alternative suggestions for methods for measuring desegregation
have appeared in the literature. Excellent reviews of most of this
literature appear in Taeuber and Taeuber (Appendix A) and in Duncan
and Duncan. This paper is an examination, both theoretical and empirical,
of three measures of desegregation, with the empirical focus on school
desegregation. The first measure examined, the dissimilarity index, is
discussed in the two sources cited above and is based on the absolute
deviation of the racial composition of a school from that of the school
district. The second measure is referred to here as the segregation
index and is based on the squared deviation. The third measure
investigated derives from information theory and has been suggested for
this use by Theil and Finizza. The major purpose of this paper is to examine
and compare the properties of these three measures in terms of how useful
they are as both descriptive devices and indicators of appropriate policy
actions.

Part I of this paper contains a separate discussion of the theoretical
nature of each index and includes empirical calculations. The data used
for these calculations are a subset of the information collected by DHEW
from public elementary and secondary schools and school districts in the
fall of 1972.¹ The sample was chosen in order to eliminate those school
districts for which the issue of school desegregation is not meaningful.
It includes all school districts surveyed in 1972 for which each of the
following were true in that year:



(1) Either the district contained more than 6 school campuses
    or at least one grade was taught at more than one campus.

(2) At least 5 percent of the student population was minority.²

(3) At least 5 percent of the student population was nonminority.

Since the original DHEW survey was based on a random sample of school
districts, with different sampling rates for different size strata, the
universe projections that are possible using the entire survey are not
reasonable based on our sample.³ This selection resulted in a set of
2,393 districts, approximately 20 percent of all school districts in
the country. Almost half of these were in the 17 southern and border
states,⁴ since minority students are relatively overrepresented in those
states. While these 2,393 districts contain only 55 percent of the total
national public school enrollment, they include more than 88 percent of
all enrolled minority students.⁵

Part II of this paper contains a comparative discussion of the three
indexes and Part III presents conclusions and additional comments.



I. ALTERNATIVE MEASURES OF SCHOOL SEGREGATION



A. Dissimilarity Index (D)

The first index we will consider was originally developed for the
purpose of describing residential segregation. It has since been applied
to the study of school segregation as well as to other topics. The
numerator of the dissimilarity index, which we shall call $D_1$, is defined
as simply the sum of the absolute deviations of the racial composition of
the schools from the overall racial composition of the school district:

$$ D_1 = \sum_i T_i \, |p_i - p| , \tag{1} $$

where $T_i$ and $p_i$ are, respectively, the total enrollment and percent
minority of the ith school, and where p is the percent minority of the
district. An implicit rationale for this measure is that the contribution
of the ith school to the "badness" of segregation is proportional
to the absolute difference between $p_i$ and p.

The index of dissimilarity (D) is then derived by dividing the value
of $D_1$ by its maximum. This maximum will occur in a totally segregated
system and is given by⁸

$$
\begin{aligned}
\max D_1 &= \sum_{i:\,p_i = 0} T_i \,(p - 0) + \sum_{i:\,p_i = 1} T_i \,(1 - p) \\
         &= p\,(\text{\# of nonminority students}) + (1 - p)\,(\text{\# of minority students}) \\
         &= p(1 - p)T + (1 - p)pT \\
         &= 2Tp(1 - p) .
\end{aligned}
\tag{2}
$$

Dividing $D_1$ by this maximum therefore gives an index that ranges from
0 to 1 for any given school district:⁹

$$ D = \frac{\sum_i T_i \, |p_i - p|}{2Tp(1 - p)} . \tag{3} $$

An important characteristic of D is that its value is not dependent
on the overall distribution of students by race but only on the numbers
of students in schools with less than and those with greater than the
district-wide proportion of minority students. This can be seen by
decomposing $D_1$ as follows:

$$
D_1 = \sum_i T_i \, |p_i - p|
    = p \left[ \sum_{p_i < p} T_i - \sum_{p_i \ge p} T_i \right]
    + \left[ \sum_{p_i \ge p} T_i p_i - \sum_{p_i < p} T_i p_i \right] .
\tag{4}
$$



The first bracketed term on the right-hand side of (4) is simply the
difference between the numbers of students in the two groups of schools,
while the second bracketed term is the difference between the numbers of
minority students in the two groups of schools.¹⁰ The value of D is
unaffected by transferring students between any schools within each group;
only by transferring them across the two groups will D change. Thus, D
is independent of assignment among those schools for which $p_i < p$ or among
those for which $p_i \ge p$ and is completely determined by the total numbers
of minority and nonminority students in each of the two groups of schools.
Alternatively, one can say that the payoff criterion implicit in D is 
linear (as opposed to the quadratic payoff criterion implicit in the 
second index to be discussed below). An important effect of this linearity 
is that the payoff (measured by changes in the value of D) is the same for 
bringing a particular school x percentage points closer to the overall 
racial composition of the district, regardless of how far away from that 
composition the school was originally. Since it is often assumed that 
achieving a given "amount" of desegregation is "easier" the more segregated 
are the schools to begin with, the use of D as a policy variable may not 
provide the desegregation incentives desired: if this assumption is valid, 
then the payoff should be nonlinear in the sense that a given "amount" 
of desegregation is rewarded more for initially more segregated districts. 
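
The definition of D and its insensitivity to reassignment within either group of schools can be checked numerically. The following is a minimal Python sketch rather than the procedure used for the paper's calculations; the function name `dissimilarity` and the three-school district are hypothetical, and each school is assumed to be given as a (minority, nonminority) pair of counts.

```python
def dissimilarity(schools):
    """Dissimilarity index D for a list of (minority, nonminority) counts.

    Assumes every school is non-empty and the district contains both groups.
    """
    total = sum(m + n for m, n in schools)        # T
    p = sum(m for m, _ in schools) / total        # district proportion minority
    # Numerator: enrollment-weighted absolute deviations, sum_i T_i |p_i - p|
    numerator = sum((m + n) * abs(m / (m + n) - p) for m, n in schools)
    # Denominator: the value reached in a totally segregated system, 2Tp(1 - p)
    return numerator / (2 * total * p * (1 - p))


# Hypothetical district with p = 0.4: the first two schools lie below the
# district-wide proportion minority, the third lies above it.
before = [(10, 90), (30, 70), (80, 20)]
# Trade 10 minority for 10 nonminority students between the two schools that
# are below p; both stay below p, so D is expected to be unchanged.
after = [(20, 80), (20, 80), (80, 20)]

d_before = dissimilarity(before)
d_after = dissimilarity(after)
```

Under these assumptions both calls return the same value, because the trade moves students only within the group of schools for which the school proportion is below the district proportion.
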
As a measure of segregation, the dissimilarity index has two very
appealing features. First, it is the easiest to compute of all indexes
discussed here. This characteristic derives from the fact that the only
disaggregated information required is the numbers of minority and
nonminority students in the two groups of schools identified above. Second,
D has a straightforward intuitive interpretation since it equals the
proportion of minority (or nonminority) students who would have to be
transferred in order to achieve the same racial composition in all schools.
Furthermore, a method of decomposing the value of D on the basis of other
attributes is available.¹¹ In addition, a convenient interpretation may
be attached to the weighted sum of absolute deviations given by (1) taken
as a percent of total student enrollment, or

$$ \frac{D_1}{T} = \frac{1}{T} \sum_i T_i \, |p_i - p| . $$

This quantity is the minimum percent of the total student body who would
have to be involved in two-way minority-nonminority trades between schools
in order to achieve racial balance and has been called the replacement
index.¹²

Table 1 displays the distribution of values of D across districts.
The data sample used is the one described above in the Introduction and
results are presented separately for southern school districts. Looking
at the distributions of school districts across values of D, one sees
little difference in the degree of segregation between the two regions.
This is surprising, since most indications are that more school
desegregation has occurred in recent years in the South than elsewhere.
However, a different picture emerges if one compares the distributions of
students across values of D for the two regions: substantially larger
percentages of students, especially of minority students, are in relatively
segregated school systems outside the South. The main reason that the two
distributions lead to different conclusions is that the nonsouthern
districts include more large school districts that are relatively
segregated than do the southern districts. This is illustrated by the data
in Table 2, which is taken from the table presented in the Appendix.¹³
While more than 96 percent of the students (and 98 percent of the minority
students) in the nine largest nonsouthern districts were enrolled in school
systems with values of D greater than 2/3, this was true of only 30 percent
of the students (and 66 percent of the minority students) in the eleven
largest southern districts listed. (Note also that more than 73 percent of
the minority students but only 40 percent of the nonminority students in
these largest twenty districts were outside the South.)

Table 2

Total Enrollment, Minority Enrollment, and Values of D
for the 20 Largest Districts in the Sample

District Name                    Total       Minority       D
                            Enrollment     Enrollment

South:
  Broward Co., Fla.            128,889         31,640      .31
  Dade Co., Fla.               241,809        124,870      .52
  [illegible]              [illegible]    [illegible]      .33
  Hillsborough Co., Fla.       106,294         27,196      .18
  Baltimore City, Md.          186,600        129,250      .82
  Montgomery Co., Md.          126,912         12,799      .29
  Prince Georges Co., Md.      161,961         42,935      .61
  St. Louis City, Mo.          105,617         72,985      .90
  Memphis City, Tenn.          138,714         80,403      .86
  Dallas, Texas                154,580         76,366      .70
  Houston, Texas               225,410        127,128      .73
  Totals                     2,744,136        735,476

Non-South:
  [illegible]              [illegible]    [illegible]   [illegible]
  San Diego, Cal.              124,604         32,790      .53
  Chicago, Ill.                557,141        384,149      .80
  Detroit, Mich.               276,655        192,259      .74
  New York City, N.Y.        1,125,449        724,954      .67
  Cleveland, Ohio              145,196         87,007      .88
  Columbus, Ohio               106,676         31,825      .70
  Philadelphia, Pa.            282,965        183,424      .78
  Milwaukee, Wis.              128,734         43,665      .76
  Totals                     3,368,127      2,007,351

Grand Totals                 6,112,263      2,742,827

B. Segregation Index (S)

An interesting feature of the segregation index (S) is that it was
developed separately and independently by two groups each using different
rationales,¹⁴ one statistical and the other in terms of policy goals. Three
conceptual bases for S will be discussed here in order to shed additional
light on its interpretation.

1. S as a Policy-Goal Measure

Assume that the goal of school desegregation is to avoid racial
isolation and that this goal is achieved for each student in proportion
to the percent of students belonging to the other racial group in the
same school. In other words, the contribution of each minority child
towards this goal equals the proportion of nonminority children attending
the same school. Averaging this criterion over all minority children
then yields

$$ \frac{\sum_i T_i p_i (1 - p_i)}{\sum_i T_i p_i} = \frac{\sum_i T_i p_i (1 - p_i)}{Tp} , \tag{5} $$

where $T_i p_i$ equals the number of minority students in the ith school and
$(1 - p_i)$ is the proportion of nonminority students attending that school.
This quantity will be maximized when $p_i = p$ for all i, i.e., when all
schools have the same racial composition. This maximum value equals
(1 - p), the district-wide percent nonminority.¹⁵ We therefore define
the segregation index to be one minus the value of (5) taken as a percent
of its maximum possible value, or

$$ S = 1 - \frac{1}{1 - p} \cdot \frac{\sum_i T_i p_i (1 - p_i)}{Tp} = 1 - \frac{\sum_i T_i p_i (1 - p_i)}{Tp(1 - p)} . \tag{6} $$

In this context, the value of S may be interpreted as the amount of 
"exposure" between minority and nonminority students that has not been 
achieved within the schools relative to the maximum amount possible. 
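
The exposure interpretation can be illustrated with a short numerical sketch. The Python below is a hypothetical illustration, not the paper's computation: the function name `segregation_index` and the three example districts are assumptions, and schools are again given as (minority, nonminority) counts.

```python
def segregation_index(schools):
    """S = 1 - (average exposure of minority students) / (1 - p).

    Assumes every school is non-empty and the district contains both groups.
    """
    total = sum(m + n for m, n in schools)
    minority = sum(m for m, _ in schools)
    p = minority / total
    # Average, over minority students, of the nonminority share of their school
    exposure = sum(m * (n / (m + n)) for m, n in schools) / minority
    # Exposure is at most (1 - p), reached when every school has p_i = p
    return 1 - exposure / (1 - p)


balanced = [(40, 60), (20, 30)]           # every school at p = 0.4
segregated = [(50, 0), (0, 50)]           # no school mixes the two groups
mixed = [(10, 90), (30, 70), (80, 20)]    # intermediate case

s_balanced = segregation_index(balanced)
s_segregated = segregation_index(segregated)
s_mixed = segregation_index(mixed)
```

In this sketch the balanced district yields S = 0, the totally segregated one S = 1, and the mixed district falls between the two.
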
2. S as a Mean-Square-Deviation Measure 

Assume that the goal of school desegregation is to avoid deviations 
from the mean racial composition and that the "costs" of such deviations 
increase with the square of the deviation. The mean-square-deviation 
(MSD), averaged over all schools and weighted by school enrollments, is 
then 

$$ \mathrm{MSD} = \sum_i T_i (p_i - p)^2 , \tag{7} $$

which can also be written as

$$ \mathrm{MSD} = Tp(1 - p) - \sum_i T_i p_i (1 - p_i) . \tag{8} $$

The maximum value of the expression in (8) occurs when schools are
totally segregated and is given by

$$ Tp(1 - p)^2 + T(1 - p)p^2 = Tp(1 - p) . \tag{9} $$

The first term on the left-hand side of (9) is simply the number of
minority students in the district (Tp) times the contribution of the
all-minority schools to MSD, since $p_i = 1$ in each of these schools.
The second term is likewise the number of nonminority students [T(1 - p)]
multiplied by the contribution of their schools ($p^2$), in which $p_i = 0$.
Thus we define the index as MSD (7) divided by its maximum value (9),
or

$$ \frac{\sum_i T_i (p_i - p)^2}{Tp(1 - p)} , $$

which can also be written as:

$$ 1 - \frac{\sum_i T_i p_i (1 - p_i)}{Tp(1 - p)} = S . \tag{10} $$

Thus, minimizing the value of the MSD index is exactly equivalent to
minimizing the value of S.

3. S as a Variance-Accountability Measure

Consider a binomial race variable $R_{ij}$ that equals 1 if the jth
student in the ith school is minority and 0 otherwise. Then the
appropriate hypothesis test for equality of racial composition across
schools can be derived from analysis of variance.¹⁶ The expected value of
$R_{ij}$ is p and its variance can be decomposed as follows:

$$ \sum_i \sum_j (R_{ij} - p)^2 = \sum_i \sum_j (R_{ij} - p_i)^2 + \sum_i T_i (p_i - p)^2 . \tag{11} $$

The first term on the right-hand side of (11) is the "within samples"
variation and can be interpreted as the variance "attributable to
desegregation" since it measures the mean-square-deviation of $R_{ij}$ within
schools. The second term can be interpreted as the variance "attributable
to segregation" since it measures the mean-square-deviation of $R_{ij}$ between
schools. In terms of S, the decomposition of (11) can be rewritten as

$$ \sum_i \sum_j (R_{ij} - p)^2 = Tp(1 - p)(1 - S) + Tp(1 - p)S . \tag{12} $$

Since Tp(1 - p) is the total variance in the system, S can be interpreted
as the percent of the total variance attributable to segregation.¹⁷

To clarify this interpretation, consider the following measure of
association between the binomial color variable and the school to which
a student is assigned:

$$ \phi^2 = \frac{\chi^2}{T(L - 1)} , \tag{13} $$

where $\chi^2$ is the Pearson chi-square computed from a 2xK contingency table
(K is the number of schools in the district) and L is the smaller of the
number of rows and columns in that table. $\phi$ is often called Cramer's
statistic and should not be confused with the contingency coefficient.
The value of $\phi$ must lie between 0 (complete independence) and 1 (perfect
association).¹⁸ Since we are constraining the number of racial/ethnic
groups to be 2 (minority and nonminority), and since it is only meaningful
to discuss desegregation when there is more than one school, L must always
equal 2. Therefore:

$$ \phi^2 = \frac{\chi^2}{T} . $$

However, for our purposes, we can write $\chi^2$ as follows:¹⁹

$$ \chi^2 = \sum_i \left[ \frac{(T_i p_i - T_i p)^2}{p T_i} + \frac{\left( T_i (1 - p_i) - T_i (1 - p) \right)^2}{(1 - p) T_i} \right] . $$

Rearranging and combining terms yields

$$ \chi^2 = \frac{\sum_i T_i (p_i - p)^2}{p(1 - p)} , $$

which, using (10) above, reduces to

$$ \chi^2 = TS . $$

Thus,

$$ \phi^2 = S . \tag{14} $$

Although $\phi^2$ may not be conveniently interpreted as the proportion of the
variance in one variable explained by the other, it does provide us with
a measure of association between race and school assignment that can be
compared across different school districts. As with MSD, minimizing $\phi^2$
is equivalent to minimizing S, so that the two amount to the same
desegregation criterion.
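
The identity between the chi-square measure of association and S can be verified on a small example. The sketch below is illustrative only: the function names and the three-school district are hypothetical, the chi-square is computed directly from observed and expected cell counts rather than with a statistics library, and S is computed in its mean-square-deviation form.

```python
def pearson_chi_square(schools):
    """Pearson chi-square for the 2 x K table of race by school.

    `schools` is a list of (minority, nonminority) counts, one per school.
    """
    total = sum(m + n for m, n in schools)
    p = sum(m for m, _ in schools) / total
    chi2 = 0.0
    for m, n in schools:
        t_i = m + n
        # (observed - expected)^2 / expected for the minority cell ...
        chi2 += (m - t_i * p) ** 2 / (t_i * p)
        # ... and for the nonminority cell of the same school
        chi2 += (n - t_i * (1 - p)) ** 2 / (t_i * (1 - p))
    return chi2


def segregation_index(schools):
    """S as the mean-square deviation divided by its maximum, Tp(1 - p)."""
    total = sum(m + n for m, n in schools)
    p = sum(m for m, _ in schools) / total
    msd = sum((m + n) * (m / (m + n) - p) ** 2 for m, n in schools)
    return msd / (total * p * (1 - p))


district = [(10, 90), (30, 70), (80, 20)]    # hypothetical counts
total = sum(m + n for m, n in district)
phi_squared = pearson_chi_square(district) / total   # L = 2, so divide by T only
s_value = segregation_index(district)
```

For this district the two quantities agree up to floating-point rounding, as the derivation above implies.
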

Table 3 shows the distribution of districts, schools, and students
in the sample by values of S and by region. The notable difference between
Table 3 and Table 1 (values of D) is that districts tend to be more heavily
clustered under lower values of S than they were for D. This is not terribly
surprising, since using the mean-square-deviation should more heavily
weight divergences (and, therefore, segregation) than using the average
absolute deviation. It is important to note that our conclusions with
respect to South/non-South comparisons are exactly the same as above:
namely, although the distribution of districts tends to indicate about
the same amount of segregation in the two regions, the distribution of
students clearly shows more segregation outside the South.

C. Information Theory Index (H)

Information theory provides us with a technique for measuring the
degree of association between two qualitative or categorical variables.²⁰
Consider the joint probability distribution given by P(A,B) where A and
B are nonquantifiable events. The marginal and conditional distributions
are given by P(A), P(B), P(A|B), and P(B|A). For our purposes, we define
A as the school that an individual student attends and B as the minority/
nonminority status of the student. Information theory then defines the
average joint uncertainty of A and B as

$$ H(A,B) = -\sum_i \sum_j P(A_i, B_j) \log P(A_i, B_j) . $$

Letting $A_i$ represent assignment to school i (i = 1, ..., K), $B_1$ represent
minority status, and $B_2$ represent nonminority status, we can write

$$ H(A,B) = -\sum_i \left[ \frac{p_i T_i}{T} \log \frac{p_i T_i}{T} + \frac{(1 - p_i) T_i}{T} \log \frac{(1 - p_i) T_i}{T} \right] . \tag{15} $$

The average marginal and conditional uncertainties are similarly defined
and expressed as follows:

$$ H(A) = -\sum_i P(A_i) \log P(A_i) = -\sum_i \frac{T_i}{T} \log \frac{T_i}{T} ; \tag{16} $$

$$ H(B) = -\sum_j P(B_j) \log P(B_j) = p \log \frac{1}{p} + (1 - p) \log \frac{1}{1 - p} ; \tag{17} $$

$$ H(A \mid B) = -\sum_i \sum_j P(A_i, B_j) \log P(A_i \mid B_j) = -\sum_i \frac{T_i}{T} \left[ p_i \log \frac{p_i T_i}{pT} + (1 - p_i) \log \frac{(1 - p_i) T_i}{(1 - p) T} \right] ; \tag{18} $$

$$ H(B \mid A) = -\sum_i \sum_j P(A_i, B_j) \log P(B_j \mid A_i) = \sum_i \frac{T_i}{T} \left[ p_i \log \frac{1}{p_i} + (1 - p_i) \log \frac{1}{1 - p_i} \right] . \tag{19} $$

The marginal uncertainty H(B) is the average prior amount of uncertainty
about B over all possible cases, while the conditional uncertainty H(B|A)
is the average amount of uncertainty concerning event B given knowledge
of event A. The average relative reduction in uncertainty about B resulting
from knowing A can then be written as

$$ H = \frac{H(B) - H(B \mid A)}{H(B)} . \tag{20} $$

Certainly H(B) must be no less than H(B|A), since our uncertainty about
B is reduced if we have knowledge of A so long as there is any relation
at all between the two events. Thus, $0 \le H \le 1$, with H = 0 only
when A and B are independent. H can therefore be interpreted as the
relative reduction in uncertainty about the racial status of a particular
student given that we know which school that student attends. The greater
the value of H, the more certain we would be in predicting the race of any
student in a particular school. H is therefore a measure of segregation:
the larger its value for a particular school district, the more racially
segregated are the schools of that district.

We have not yet justified defining our measure as the relative
reduction in uncertainty about B given A rather than the relative reduction
in uncertainty about A given B. Consider the following symmetric measure
of association between A and B, again from information theory:²²

$$ \gamma^2 = \frac{H(A) + H(B) - H(A,B)}{\min[H(A), H(B)]} . \tag{21} $$

The numerator of $\gamma^2$ is called the "expected mutual information" and can
be shown to be nonnegative.²³ In fact, it is true in general that²⁴

$$ H(A) + H(B) - H(A,B) = H(B) - H(B \mid A) . \tag{22} $$

Furthermore, so long as the number of schools (K) is greater than 1 and
no single school contains more than one-half the students, then, taking
logarithms to the base 2 (as suggested by Theil and Finizza), we have the
result that²⁵

$$ H(A) \ge 1 \ge H(B) . $$

The denominator of $\gamma^2$ is simply H(B), which, together with (22), implies
that $\gamma^2 = H$. Thus H can be interpreted not only as a measure of the
relative reduction in uncertainty but also as a measure of association
highly analogous to a squared coefficient of correlation ($\rho^2$). The
analogy is particularly strong in that both $\gamma^2$ and $\rho^2$ indicate how much
of a reduction in uncertainty/variation in one particular variable can
be achieved by knowing another.²⁶

The relevant school segregation index is therefore

$$
H = \frac{ p \log \frac{1}{p} + (1 - p) \log \frac{1}{1 - p}
          - \sum_i \frac{T_i}{T} \left[ p_i \log \frac{1}{p_i} + (1 - p_i) \log \frac{1}{1 - p_i} \right] }
         { p \log \frac{1}{p} + (1 - p) \log \frac{1}{1 - p} } .
\tag{23}
$$

Theil and Finizza have directly derived this index as a measure of school
desegregation. They offer the interpretation of

$$ p_i \log \frac{1}{p_i} + (1 - p_i) \log \frac{1}{1 - p_i} $$

as the "racial entropy" of the student body of the ith school. Analogously,
then, H(B) can be termed the racial entropy of the district and H(B|A) the
average school racial entropy.
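
As an illustration of how the index in (23) could be computed, the following Python sketch evaluates the district racial entropy H(B), the average school racial entropy H(B|A), and their relative difference. It is an assumption-laden illustration rather than the paper's own procedure: the function names and example districts are hypothetical, logarithms are taken to the base 2, and 0 log 0 is treated as 0.

```python
from math import log2


def racial_entropy(q):
    """Entropy in bits of the proportions (q, 1 - q); 0 log 0 is taken as 0."""
    return sum(-x * log2(x) for x in (q, 1 - q) if x > 0)


def information_index(schools):
    """H = [H(B) - H(B|A)] / H(B) for a list of (minority, nonminority) counts.

    Assumes every school is non-empty and the district contains both groups,
    so that the district entropy H(B) is strictly positive.
    """
    total = sum(m + n for m, n in schools)
    p = sum(m for m, _ in schools) / total
    district_entropy = racial_entropy(p)                        # H(B)
    average_school_entropy = sum(                               # H(B|A)
        (m + n) / total * racial_entropy(m / (m + n)) for m, n in schools
    )
    return (district_entropy - average_school_entropy) / district_entropy


h_balanced = information_index([(40, 60), (20, 30)])
h_segregated = information_index([(50, 0), (0, 50)])
h_mixed = information_index([(10, 90), (30, 70), (80, 20)])
```

With these hypothetical inputs the balanced district gives H = 0, the totally segregated district gives H = 1, and the mixed district lies strictly between them.
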

Theil and Finizza also demonstrate that this type of index can be
easily aggregated over large units. Switching to their terminology and
notation for the moment for ease of presentation, we consider a set of G
school districts (such as a city) and define the following "entropies"
using the subscript g to denote values for the gth district:

$$
\begin{aligned}
\text{School:} \quad & E_i = p_i \log \frac{1}{p_i} + (1 - p_i) \log \frac{1}{1 - p_i} . \\
\text{District:} \quad & E_g = p_g \log \frac{1}{p_g} + (1 - p_g) \log \frac{1}{1 - p_g} . \\
\text{City:} \quad & E = p \log \frac{1}{p} + (1 - p) \log \frac{1}{1 - p} . \\
\text{Average district:} \quad & \bar{E}_g = \sum_{i \in g} \frac{T_i}{T_g} E_i . \\
\text{Average city:} \quad & \bar{E} = \sum_g \frac{T_g}{T} \bar{E}_g .
\end{aligned}
\tag{24}
$$

Unsubscripted values of p and T are now calculated over the entire set
of G school districts.²⁷ Note that, for the gth school district, $E_g$ is
the same as H(B) and $\bar{E}_g$ is equal to H(B|A) as defined above in (17) and
(19). The aggregation over the set of districts is straightforward: to
obtain the value of H(B|A) for the city ($\bar{E}$), one simply takes a weighted
average of the values of H(B|A) for each district ($\bar{E}_g$) with weights
corresponding to the proportion of the city's enrollment in each district.

As well as providing a convenient method of aggregation, this
formulation also allows us to make an interesting decomposition of $\bar{E}$.
Theil and Finizza show that

$$ \bar{E} = \sum_g \frac{T_g}{T} E_g - \sum_g \frac{T_g}{T} \bar{I}_g , \tag{25} $$

where

$$ \bar{I}_g = \sum_{i \in g} \frac{T_i}{T_g} \left[ p_i \log \frac{p_i}{p_g} + (1 - p_i) \log \frac{1 - p_i}{1 - p_g} \right] . \tag{26} $$

The quantity $\bar{I}_g$ is known in information theory as the average "expected
information of the message that transforms the proportions ($p_g$, $1 - p_g$) to
a second set of proportions ($p_i$, $1 - p_i$)."²⁸ In other words, if we already
know the percent minority of the gth district's student body ($p_g$), then
$\bar{I}_g$ defines the expected information content, on the average, of a message
that tells us the percent minority of the ith school in that district ($p_i$).
Since $\bar{I}_g$ is a measure of the extent to which the racial composition of
the gth district differs from that of one of its schools, then the second
term on the right-hand side of (25) may be interpreted as a weighted average
of the degree of racial segregation in each district. The first term in
(25) is a weighted average of each district's total "entropy" and may be
interpreted as a measure of the racial composition of each district relative
to that of the city as a whole. Thus, (25) represents a decomposition of
the city's average "entropy" into a component representing "between district"
segregation and one representing "within district" segregation. This clearly
provides a potentially fruitful method for investigating the currently
controversial issue of cross-district school desegregation. Using the
decomposition of (25), we can determine not only how segregated a set
of school districts is, but also to what extent differences in the racial
compositions of the districts contribute to the segregation of the overall
system.
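
The decomposition in (25) can be checked numerically. The sketch below is a hypothetical illustration, not Theil and Finizza's code: the function names and the two-district "city" are assumptions, logarithms are to the base 2, and each district is assumed to contain both racial groups so that the ratios in (26) are defined.

```python
from math import log2


def entropy(q):
    """Entropy in bits of the proportions (q, 1 - q), with 0 log 0 taken as 0."""
    return sum(-x * log2(x) for x in (q, 1 - q) if x > 0)


def expected_information(q, r):
    """Information of the message turning proportions (r, 1 - r) into (q, 1 - q).

    Assumes 0 < r < 1, i.e. the district contains both groups.
    """
    return sum(a * log2(a / b) for a, b in ((q, r), (1 - q, 1 - r)) if a > 0)


def decompose(city):
    """Return (average city entropy, first term of (25), second term of (25)).

    `city` is a list of districts; each district is a list of
    (minority, nonminority) counts, one pair per school.
    """
    total = sum(m + n for district in city for m, n in district)
    average = between = within = 0.0
    for district in city:
        t_g = sum(m + n for m, n in district)
        p_g = sum(m for m, _ in district) / t_g
        between += t_g / total * entropy(p_g)            # weighted district entropy
        for m, n in district:
            t_i, p_i = m + n, m / (m + n)
            average += t_i / total * entropy(p_i)        # weighted school entropy
            within += t_i / total * expected_information(p_i, p_g)
    return average, between, within


# Hypothetical city made up of two districts with two schools each
city = [[(10, 90), (30, 70)], [(80, 20), (60, 40)]]
average, between, within = decompose(city)
```

In this sketch `average` equals `between - within` up to rounding. Dividing the gap between the city entropy and each term by the city entropy would give between-district and within-district shares of H, though that normalization is an extrapolation from the text rather than a formula stated in it.
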

The distributions of districts, schools, and students across different
values of H are given by region in Table 4. The notable point about
these distributions is their remarkable similarity to the distributions
across S of Table 3. Virtually any conclusion one would draw from the
data of Table 3 would be identical if Table 4 were used. Some of the
reasons for this similarity will be discussed in the next section.



II. COMPARISON OF MEASURES 

There are some important qualifications that must be kept in mind 
when interpreting actual values of these indexes as measures of the 
extent of school desegregation. Each index is computed here on the 
basis of the entire student body across the district. This means that 
two implicit and erroneous assumptions must be recognized: (1) that 
students can be transferred between grade levels as well as between 
schools, since no account is taken of the grade span offered at each 
school; (2) that a particular student can be transferred to any school 
in the district just as "easily" as to any other. Assumption (1) is 
necessary even if one is only considering how much desegregation has 
been achieved within a particular district relative to what that district 
could accomplish. However, it is likely to create serious problems of 
interpretation in only two instances: if the district contains only a
few schools, or if either the racial composition or actual degree of
desegregation differs substantially between sets of schools offering
different grade spans (e.g., between elementary and secondary schools).
Because we have excluded the very small districts from our sample, the
effect of the first problem has been somewhat alleviated. Although we
have not dealt explicitly with the second case, there is no particular
reason to believe that it causes much of a problem except, perhaps, as a
result of different dropout rates for older students.

Assumption (2) is not necessary unless one wishes to compare index 
values across districts as measures of relative desegregation efforts. 
In that case, account must be taken of the factors influencing relative 
costs of desegregation in different districts. Some of these factors 
are racial residential segregation, location of and distances between 
schools, and school capacities relative to population densities. No 
notion of these cost factors is included in the definition of any of
the indexes. The closest we come to dealing with these problems here
is in recognizing that the incremental cost of desegregation rises with
the absolute level of desegregation. This implies that our choice of a 
measure for policy purposes should be one whose marginal payoff is a 
decreasing function of the level of desegregation. As we shall see below, 
both S and H exhibit this characteristic, while D does not. 

The three indexes discussed here have several characteristics in 
common. First, they are all perfectly symmetrical with respect to the 
two racial/ethnic groups. Second, they are all nonconcave functions 
of the racial mix in each school. This ensures that optimization on any
one of the indexes will yield the most homogeneous possible racial
composition of the schools.²⁹ The linearity of D, however, distinguishes
it from the other two indexes, since the incremental payoff per student
in terms of D is the same for a particular school once it is known whether
that school's $p_i$ is less than or greater than p. Figure 1 shows the
marginal payoff per student in a particular school for H and S over
different values of $p_i$. The two values have been plotted using different
scales (since their maximum values are not the same) to show that the
shapes of the two payoff functions are very similar.³¹ This is not
surprising, since both of these indexes are measures of association between
student racial affiliation and school assignment. For this reason, we
would also expect any set of calculated values of S and H to be highly
correlated, as, in fact, they turn out to be.³²
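
The contrast between the linear payoff of D and the diminishing marginal payoff of S and H can be illustrated with a small numerical experiment. The Python below is a hypothetical sketch and not the computation behind Figure 1: the function names and the four-school district are assumptions. It compares the drop in each index when a 10-percentage-point improvement is made in a pair of totally segregated schools with the drop from the same improvement in a pair of schools already close to the district-wide composition.

```python
from math import log2


def _summary(schools):
    """Return total enrollment T and district proportion minority p."""
    total = sum(m + n for m, n in schools)
    return total, sum(m for m, _ in schools) / total


def d_index(schools):
    total, p = _summary(schools)
    dev = sum((m + n) * abs(m / (m + n) - p) for m, n in schools)
    return dev / (2 * total * p * (1 - p))


def s_index(schools):
    total, p = _summary(schools)
    msd = sum((m + n) * (m / (m + n) - p) ** 2 for m, n in schools)
    return msd / (total * p * (1 - p))


def entropy(q):
    """Entropy in bits of (q, 1 - q), with 0 log 0 taken as 0."""
    return sum(-x * log2(x) for x in (q, 1 - q) if x > 0)


def h_index(schools):
    total, p = _summary(schools)
    school_avg = sum((m + n) / total * entropy(m / (m + n)) for m, n in schools)
    return (entropy(p) - school_avg) / entropy(p)


# Four schools of 100 students as (minority, nonminority); district p = 0.5
base = [(100, 0), (0, 100), (60, 40), (40, 60)]
# Move the two totally segregated schools 10 points toward p
fix_extreme = [(90, 10), (10, 90), (60, 40), (40, 60)]
# Move the two nearly balanced schools 10 points toward p instead
fix_moderate = [(100, 0), (0, 100), (50, 50), (50, 50)]

# For each index: (reduction from fixing extremes, reduction from fixing moderates)
payoff = {
    name: (index(base) - index(fix_extreme), index(base) - index(fix_moderate))
    for name, index in (("D", d_index), ("S", s_index), ("H", h_index))
}
```

In this example D falls by the same amount in both cases, while S and H fall by considerably more when the most segregated schools are the ones improved.
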

There are other possible applications for these types of indexes
within the context of school segregation. They can be and have been used
to examine issues of school faculty segregation by race as well as racial
segregation of students between classrooms within grade level. Both of
these issues have been very important in the South, first, because faculty
desegregation has been interpreted by the courts as a necessary step in
eliminating dual school systems and, second, because instances have been
uncovered of southern systems that, after having desegregated their schools,
effectively resegregate students by classroom. Table 5 displays simple
correlation coefficients between the three indexes computed, for the sample
of districts described above, on the bases of faculty desegregation and
classroom desegregation for grades 3, 6, 9, and 12. The means and standard
deviations of the index values are also presented.

The indexes D_F, S_F, and H_F are straightforward extensions of D, S, 
and H, with the focus now on the numbers and racial composition of faculty 
members in different schools. Extending the indexes to measure classroom 
segregation is slightly more complicated because of aggregation problems. 
As noted above, aggregation is no problem at all when using the information 
theory index. However, with both D and S, the issue arises as to which 
racial composition each classroom should be compared against: that of the 
school or that of the entire district. Using the former creates the 
problem of how to aggregate over all the schools in the district, while 
using the latter implies that students can be transferred between any 
two classrooms (of their grade level) in the district regardless of which 
school that classroom is in. We present here results for only one classroom 
segregation index other than H: S_3, S_6, S_9, and S_12, computed 
analogously to S using the district-wide percent minority for the appropriate 
grade level. The correlation coefficients between the S and H 
measures of classroom segregation reconfirm our theoretical claim that 
these two indexes tend to measure the same thing. 

The high correlations between D and S and between D and H are 
somewhat surprising, since the distribution in Table 1 seems to be very 
different from those in Tables 3 and 4. However, if we note the fact 
that the variances of S and H appear to be somewhat smaller than that of 
D and that their ranges are lower, it is reasonable to assert that the 
three indexes do, indeed, move together linearly across districts. This 
can be confirmed by scanning the listing of index values for large districts 
in Table A.2 of the Appendix. Not only is the value of D always higher 
than those of S and H, but, while D never falls below .1 for this set of 
districts, S and H frequently do. Note also that D_F is not at all highly 
correlated with either S_F or H_F. 

III. CONCLUSION AND COMMENTS 



The evidence we have shown here leads us to conclude that, among 
the three segregation indexes discussed, the one derived from information 
theory (H) is the most useful. To appropriately qualify this statement, 
we will now consider the reasons for this choice. 

Three reasons can be stated for preferring either S or H as a measure 
of segregation over D. First, both S and H incorporate the notion of 
diminishing marginal payoff to desegregation. This is useful in both 
a descriptive and a policy sense since there is good reason to believe 
that the cost of additional desegregation rises with the level. It is 
also relevant to incorporate this notion as a policy incentive since more 
weight is thereby given to desegregation efforts by the most segregated 
districts. Second, we have seen that S and H both depend on the entire 
distribution of students across schools rather than, as D does, on the 
numbers of students in schools with less than and those with more than 
the district-wide percent minority. Finally, although the dissimilarity 
index has a convenient and appealing interpretation, so do S and H. There 
seems to be no particular reason for preferring one of these interpretations 
to another. The ease of calculating D is an additional point in its favor 
and certainly relevant, although computers can just as easily handle one 
index as another. 
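
The second point can be checked on a small hypothetical district. The sketch below assumes the usual forms of the three indexes, D = sum of T_i|p_i - p| over 2Tp(1 - p), S = 1 minus the sum of T_i p_i(1 - p_i) over Tp(1 - p), and H = 1 minus the enrollment-weighted school entropy over the district entropy. These forms are reconstructed from the notes to this paper rather than quoted from it, and the enrollment figures and the function name `indexes` are invented for illustration.

```python
import math

def entropy(p: float) -> float:
    """Binary entropy in bits; taken as 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def indexes(schools):
    """schools: list of (minority, nonminority) counts. Returns (D, S, H)."""
    T = sum(m + w for m, w in schools)
    p = sum(m for m, _ in schools) / T
    d_num = s_num = h_num = 0.0
    for m, w in schools:
        t_i = m + w
        p_i = m / t_i
        d_num += t_i * abs(p_i - p)       # absolute deviation from district mix
        s_num += t_i * p_i * (1 - p_i)    # within-school variance term
        h_num += t_i * entropy(p_i)       # within-school entropy term
    D = d_num / (2 * T * p * (1 - p))
    S = 1 - s_num / (T * p * (1 - p))
    H = 1 - h_num / (T * entropy(p))
    return D, S, H

# Hypothetical district: two schools above the district-wide percent minority, one below.
before = [(80, 20), (70, 30), (0, 150)]
# Ten minority students move from school 2 to school 1 and ten nonminority
# students move the other way; both schools remain above the district mix.
after = [(90, 10), (60, 40), (0, 150)]

d0, s0, h0 = indexes(before)
d1, s1, h1 = indexes(after)
print(f"before: D={d0:.3f} S={s0:.3f} H={h0:.3f}")
print(f"after:  D={d1:.3f} S={s1:.3f} H={h1:.3f}")
```

Under these assumptions the exchange leaves D unchanged, because both schools stay on the same side of p, while S and H both rise to reflect the more uneven distribution.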

Why, then, do we prefer H to S? Again, three arguments are put 
forth. First, we have seen that H can be conveniently and meaningfully 
aggregated, whereas the proper aggregation procedure for S is somewhat 
ambiguous. Although this point is not relevant when considering simply 
the level of school segregation within a district (or the level of classroom 
segregation within a school), it becomes very important in issues 
such as cross-district desegregation or state-by-state comparisons of 
desegregation efforts. Second, we have also seen that, for certain 
issues, a convenient decomposition of H is available, while this is not 
true of S. Finally, although both S and H are measures of association 
between racial/ethnic affiliation and school assignment, the interpretation 
of H is a bit more precise because of its analogy with the squared 
correlation coefficient. This last reason is a rather marginal one, 
since both S and H can be interpreted as the percent of one thing 
"attributable to" another. However, it should be noted that, unlike S, 
the definition of the information theory measure H would allow us to 
extend it to the case of more than two racial/ethnic categories.[33] 
Although this has not yet been applied to the issue of school segregation, 
it is potentially useful in areas with more than one predominant 
minority group, such as Blacks and Chicanos in the Southwest. 
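
The aggregation and decomposition properties claimed for H can be sketched as follows. The code assumes the unnormalized information measure for a set of units is the entropy of the overall racial mix minus the enrollment-weighted average entropy of the units, and splits the measure for a group of districts into a between-district part and a within-district part. The district names, the enrollment figures, and the functions `info` and `decompose` are hypothetical.

```python
import math

def entropy(p: float) -> float:
    """Binary entropy in bits; taken as 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def info(units):
    """Entropy of the pooled racial mix minus the weighted average unit entropy."""
    T = sum(m + w for m, w in units)
    p = sum(m for m, _ in units) / T
    avg_unit_entropy = sum((m + w) / T * entropy(m / (m + w)) for m, w in units)
    return entropy(p) - avg_unit_entropy

def decompose(districts):
    """districts: dict mapping a name to a list of (minority, nonminority) schools.
    Returns (total, between, within) in bits."""
    all_schools = [s for schools in districts.values() for s in schools]
    T = sum(m + w for m, w in all_schools)
    total = info(all_schools)
    # Between component: treat each district as a single large unit.
    district_totals = [(sum(m for m, _ in sch), sum(w for _, w in sch))
                       for sch in districts.values()]
    between = info(district_totals)
    # Within component: enrollment-weighted average of each district's own measure.
    within = sum(sum(m + w for m, w in sch) / T * info(sch)
                 for sch in districts.values())
    return total, between, within

example = {
    "city": [(300, 100), (200, 200)],
    "suburb": [(20, 380), (40, 360)],
}
total, between, within = decompose(example)
print(f"total={total:.4f}  between={between:.4f}  within={within:.4f}")
```

Dividing each component by the entropy of the pooled racial mix would express the split as shares of a normalized index. The identity total = between + within holds exactly under these definitions, which is what makes cross-district comparisons straightforward.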

In addition to the usefulness of indexes as descriptive devices, 
they can have important applications as policy tools. Some examples 
relevant to the issue of school segregation are worth mentioning. 
Segregation indexes can be an informative aid in enforcing civil rights 
legislation. Indexes can be used to identify where problems exist as 
well as where progress has been made. In addition, appropriate indexes 
can be used as funding criteria for certain expenditure programs. The 
Emergency School Aid Act of 1972 is a case in point. This legislation 
was developed to provide financial assistance to desegregating school 
districts, and one of the explicit funding criteria was the extent to 
which minority isolation of students was reduced. Unfortunately, minority 
group isolation was defined by the bill to refer to any school whose 
enrollment was greater than 50 percent minority. This ruled out the use 
of a general segregation index as a funding criterion, although such an 
index would have prevented an incentive for resegregation in districts 
which were, overall, more than 50 percent minority. Nevertheless, it 
provides a good example of the type of policy uses to which such indexes 
can be put. 

The uses of indexes similar to the ones presented here are not, of 
course, limited to the issue of school segregation. The concepts embodied 
in this paper are directly transferable to the issue of residential 
segregation and, indeed, to any issue involving the distribution of a 
two-category (binomial) variable across some specified units, such as 
the distribution by race and by sex across occupations. 

Finally, it is important to note that the two characteristics of 
the dissimilarity index that have made it so appealing — its ease of 
computation and its convenient interpretation — should not be dismissed 
lightly, especially given the realities of federal policy making. It 
is this writer's experience that even slightly complex analytic techniques 
are very slow to gain acceptance within the government bureaucracy. 
Nevertheless, if the effort is to be made, it should be towards a useful 
and meaningful end. In conclusion, then, the use of any of the three 
indexes presented here as a policy aid would be substantially better than 
a seat-of-the-pants type of judgment. However, if the costs of implementa- 
tion and of gaining acceptance are not too great, we would opt for the 
information theory index as the most appropriate measure of segregation. 

1 

These data are published in U.S. Department of Health, Education, and 
Welfare, Office for Civil Rights, Directory of Public Elementary and Secondary 
Schools in Selected Districts: Enrollment and Staff by Racial/Ethnic Group 
(Washington, D.C.: U.S. Government Printing Office, 1973). 

2 

The term minority is used throughout this paper to refer to all 
persons who were classified in the DHEW survey as American Indian, Negro, 
Oriental, or Spanish-Surnamed American. All other persons were reported 
in a single category and are referred to as nonminority. 

3 

The sampling procedure used by DHEW resulted in all districts containing 
at least 3,000 students being surveyed while none of those with an 
enrollment of 300 were included. 

4 

Alabama, Arkansas, Delaware, Florida, Georgia, Kentucky, Louisiana, 
Maryland, Mississippi, Missouri, North Carolina, Oklahoma, South Carolina, 
Tennessee, Texas, Virginia, and West Virginia. 

5 

Almost all of the excluded districts were omitted because they were 
either too small (2,42A districts) or greater than 95 percent non-minority 
(3,211 districts). In only 28 of the surveyed school districts was the 
student body greater than 95 percent minority, and the only large district 
in this category was the District of Columbia. 

6 

Karl E. Taeuber and Alma F. Taeuber, Negroes in Cities (Chicago: 
Aldine Publishing Company, 1967), Appendix A. 

7 

See Farley and A. Taeuber, and Leslau for its use as a measure of 
school segregation. Farley and A. Taeuber also compare school with 
residential segregation using this index. Among other things, the 
dissimilarity index has been used to measure occupational segregation 
by sex. See the Council of Economic Advisors 1973 Report, Supplement 
to Chapter 4. 
8 

Schools for which p_i = p may arbitrarily be placed in either summation. 

9 

Note that D can equivalently be expressed as 

\[ D = \frac{1}{2} \sum_i \left| \frac{M_i}{M} - \frac{W_i}{W} \right| , \]

where M and W refer to numbers of minority and nonminority students respectively. 
Note also that D is perfectly symmetrical with respect to minority 
and nonminority students since its value would be unchanged if p_i and p were 
defined instead as the proportions of nonminority students. 

10 

Since D is symmetrical with respect to the two racial groups, it 
can also be written as 

\[ D = \frac{1}{2\,T\,p\,(1 - p)} \left[ \sum_{p_i < p} T_i \big( (1 - p_i) - (1 - p) \big) - \sum_{p_i > p} T_i \big( (1 - p_i) - (1 - p) \big) \right] . \]



11 

See Halliman H. Winsborough, "A Note on the Decomposition of Indexes 
of Dissimilarity," Institute for Research on Poverty (University of Wisconsin- 
Madison) Discussion Paper No. 201-74 (Madison, Wisconsin: 1974), for the 
derivation and discussion. The sample he uses compares racial residential 
segregation with between- and within-group income distributions. 

12 

Reynolds Farley and Karl Taeuber, "Population Trends and Residential 
Segregation Since 1960," Science, Vol. 159, No. 3818 (March 1, 1968): 956. 



13 

Three large districts do not appear in Table 2 because they were 
excluded from the sample: the District of Columbia, which is greater 
than 95 percent minority, and Baltimore County, Md., and Fairfax County, 
Va., both of which are less than 5 percent minority. 

14 

See Ira H. Cisin, "Statistical Indices of School Integration," 
Technical Memorandum 70-1, Social Research Group, George Washington University, 
for the first and George Pugh, "Criteria for Measurement of Integration Level," 
Paper 65, Lambda Corporation, Arlington, Virginia, for the second. The Pugh 
paper also contains a helpful discussion of several alternative measures, 
including the one described by Cisin. 

15 

S (like D) is perfectly symmetrical between the two racial groups 
and can be derived by averaging the percent minority in each school over 
all non-minority students. This average would then be 



\[ \frac{\sum_i T_i \, p_i \, (1 - p_i)}{T (1 - p)} \]



and its maximum value would be p. 

16 

A similar concept can be used to derive the dissimilarity index (D) 
using absolute deviations as the criterion. 

17 

One could perform a standard F test on the null hypothesis that 
p_i = p for all i using 

\[ F = \frac{S/(K - 1)}{(1 - S)/(T - K)} , \]

where K is the number of schools in the district. In practice, however, 
this is a somewhat misleading test to perform, particularly for policy 
purposes, since T is almost always very much larger than K. Thus, very 
slight deviations from racial balance will result in a rejection of the 
null hypothesis. 
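
The sensitivity of this test to district size can be seen with a little arithmetic. The sketch below evaluates the F ratio as given in this note for a hypothetical district with a very small value of S. The function name `f_statistic` and the figures used (S = .01, 50 schools, 25,000 or 250,000 students) are assumptions chosen only for illustration.

```python
def f_statistic(S: float, K: int, T: int) -> float:
    """F ratio from the note: [S / (K - 1)] / [(1 - S) / (T - K)]."""
    return (S / (K - 1)) / ((1.0 - S) / (T - K))

# A nearly balanced hypothetical district: S = .01 with 50 schools.
f_small = f_statistic(0.01, K=50, T=25_000)
f_large = f_statistic(0.01, K=50, T=250_000)
print(f"T = 25,000:  F = {f_small:.2f}")
print(f"T = 250,000: F = {f_large:.2f}")
```

Holding S and K fixed, the ratio grows roughly in proportion to T, so even a district that is almost perfectly balanced produces a large F once enrollment is in the tens of thousands.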

18 

See William Hays, Statistics (New York: Holt, Rinehart, and 
Winston, 1963), pp. 604-606, for a fuller discussion of the φ coefficient. 

19 

This is the standard Pearson chi-square statistic computed from a 
2xK contingency table using pT_i and (1 - p)T_i as the expected number of 
minority and nonminority students, respectively, in the ith school. 

20 

See Hays, op. cit., pp. 610-612, and Henri Theil, Economics and 
Information Theory (Amsterdam: North-Holland Publishing Company, 1965), 
Chapters 1-3, for fuller discussions of this approach. 

21 

This formula amounts to the expected value of the quantity -log P(A_i, B_j) 
over all i and j. When the logarithm is taken to the base 2 (as we choose 
below to do), then this quantity equals the minimum number of "yes-no" 
questions one would have to ask in order to determine the school and 
racial/ethnic affiliation of any particular student. In the language 
of information theory, it is the "information content" of the message 
containing both of these pieces of information about the student. For 
a univariate application of this concept to measures of industrial concentration, 
see Theil, op. cit., Chapter 8. An additional justification for 
using the logarithmic function is its additive properties. See Theil, op. cit., 
Chapter 4. 

22 

See Hays, op. cit., p. 611. 

23 

For a proof, see Theil, op. cit., pp. 34-35. 

24 

See Theil, op. cit., pp. 49-50, for the proof. In his words, this 
result can be described as follows: "The expected mutual information is 
equal to the unconditional entropy [i.e., expected information content], 
given the messages sent." In equation (22), the left-hand side represents 
the expected mutual information, H(B) the unconditional entropy, and H(B|A) 
the entropy conditioned on knowledge of A. 
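
The relation among these entropies can be verified on a small hypothetical district. In the sketch below, A denotes school assignment and B racial/ethnic affiliation, the joint entropy is computed as the expected value of -log2 P(A_i, B_j), and the expected mutual information is computed as H(B) - H(B|A). The enrollment figures and the helper `entropy_bits` are invented for illustration.

```python
import math

def entropy_bits(probs):
    """Expected value of -log2(q) over a probability distribution (zero cells skipped)."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# Hypothetical district: (minority, nonminority) counts for each school.
schools = [(80, 20), (70, 30), (0, 150)]
T = sum(m + w for m, w in schools)

joint = [c / T for m, w in schools for c in (m, w)]         # P(A_i, B_j)
p_school = [(m + w) / T for m, w in schools]                # P(A_i)
p_race = [sum(m for m, _ in schools) / T,
          sum(w for _, w in schools) / T]                   # P(B_j)

h_joint = entropy_bits(joint)
h_a = entropy_bits(p_school)
h_b = entropy_bits(p_race)
# Conditional entropy of race given school: weighted average of school entropies.
h_b_given_a = sum((m + w) / T * entropy_bits([m / (m + w), w / (m + w)])
                  for m, w in schools)

mutual = h_b - h_b_given_a
print(f"H(A)={h_a:.4f}  H(B)={h_b:.4f}  H(A,B)={h_joint:.4f}")
print(f"H(B|A)={h_b_given_a:.4f}  mutual information={mutual:.4f}")
```

The same quantity is obtained as H(A) + H(B) - H(A,B), which is a convenient cross-check on any implementation.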

25 

If the proportion of students attending school k (T_k/T) is no greater 
than 1/2 for all k, then 

\[ -\log_2 (T_k/T) \ge 1 \quad \text{for all } k \]

and 

\[ H(A) = -\sum_k (T_k/T) \log_2 (T_k/T) \ge \sum_k (T_k/T) = 1 . \]

Therefore, H(A) >= 1. The value of H(B) is solely determined by p and, taking 
logs to the base 2, has a maximum value of 1 and a minimum of 0. 
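
The bound in this note can be spot-checked numerically. The sketch below computes H(A) for a few hypothetical enrollment patterns in which no school holds more than half of the students, and for one pattern that violates the condition. The function name `school_entropy` and all enrollment figures are assumptions made for illustration.

```python
import math

def school_entropy(enrollments):
    """H(A) in bits: entropy of the distribution of students across schools."""
    T = sum(enrollments)
    return -sum(t / T * math.log2(t / T) for t in enrollments if t > 0)

# Every school in these patterns enrolls at most half of the district.
balanced_cases = [[500, 500], [100, 100, 100], [400, 300, 200, 100], [10, 990, 1000]]
for case in balanced_cases:
    print(case, f"H(A) = {school_entropy(case):.4f}")

# One school enrolls 90 percent of the students, so the condition fails.
lopsided = [900, 100]
print(lopsided, f"H(A) = {school_entropy(lopsided):.4f}")
```

In each of the first four patterns H(A) is at least 1, while the lopsided pattern falls below 1, which shows that the condition on school size is doing real work in the argument.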

26 

Measures of association between more than two categorical variables 
can also be derived from information theory. See Theil, op. cit., pp. 58-59. 

27 

Thus, 

\[ T = \sum_g T_g \quad \text{and} \quad p = \sum_g (T_g/T)\, p_g . \]

28 

See Henri Theil and Anthony Finizza, "A Note on the Measurement of 
Racial Integration of Schools by Means of Informational Concepts," Journal 
of Mathematical Sociology 1971, Vol. 1, p. 191. Note that the value of I_g 
for the gth district is defined as (E_g - Ē_g) and is the same as H(B) - H(B|A) 
as defined above. 

29 

Since each index is defined here as a measure of segregation, 
minimization of the indexes will result in racially balanced schools. 
Any of the three indexes could be redefined as one minus its current 
value without loss of its properties, in which case maximization would 
be the appropriate goal. 

30 

A graph like Figure 1 cannot be drawn for D independently of the 
value of p. Such a graph would simply be two straight lines, one rising 
from 0 at p_i = 0 to its maximum where p_i = p and the other falling to 
zero at p_i = 1. 

31 

The same change of scale in Figure 1 could have been accomplished 
by taking logarithms to the base 16 for H, thereby making its maximum 
value also equal to .25. 

32 

See Table 5 below. 

33 

Applications of an information theory measure using more than two 
categories include measurements of the inequality of income and of 
industrial concentration. See Ann R. Horowitz, "Trends in the Distribution 
of Family Income Within and Between Racial Groups," in George M. von Furstenberg, 
et al., editors, Patterns of Racial Discrimination, Volume II: Employment and 
Income (Lexington, Mass.: D.C. Heath and Company, 1974), for the former and 
George J. Stigler, The Organization of Industry (Homewood, Ill.: Richard D. 
Irwin, Inc., 1968), pp. 32-35 for the latter. Horowitz makes interesting use 
of the decomposition properties to compare black and white income distributions. 
Theil also suggests a wide range of applications. 
APPENDIX 



TABLE A.1: School Districts and Enrollment 
in the 1972 Sample by Region 
and Size of District 



TABLE A.2: Segregation Indexes for Districts 
in the 1972 Sample Enrolling 
25,000 or More Students 